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Abstract: For long-duration automated operation of regenerative life support systems in 
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Overview 

• Definitions 

• Challenges of hazardous environment and 
new technology 

• Hazard analysis for complex systems 

• Hazard reduction for control software 

• Future safety-conscious systems 


Definitions 

• Advanced Integration and Control: broadly includes control, 
procedures, schedules, safety, coordination, communication, 
and anomaly response 

• Performance: throughput, latency, efficiency 

• Functionality: level of service 

• Reliability and safety: handling of failures, faults and errors 

- Controlled system, control platform, human operators 

• Vulnerabilities/hazards: unacceptable system weaknesses or 
states that can contribute to a loss 

• Safeguards: methods to prevent or eliminate vulnerabilities or 
hazards and reduce risk (likelihood x severity) 
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Challenges of Hazardous Environment, Maturing 
Technology, and Closed Recycling Systems 

• Incorrect specifications and assumptions are 
inevitable for new technology in harsh conditions 

- Unexpected system states 

- Operators and software will need to solve problems and 
adapt to unavoidable unanticipated situations 

• Complexity and interaction in tight coupling 

- Dynamic interactions in closed set of recycling systems 
with minimal buffers 


System Accidents 

• Interactive systems, tight coupling, complexity 

- Difficulty in analysis leading to unanticipated situations that are 
difficult to understand when they happen 

- Combinations and synergistic effects: common causes, canceling 
failures, side effects, command combinations and timing 

- Interactions in dynamic complex trajectories or histories: distant 
effects, compensating mechanisms 

• Surprise due to mismatch between operations and 

system state 

- Missing information: concealed, ignored/missed 

- Wrong information: misleading, misinterpreted 

• Damaging omissions or errors in control, 

operations or safety response 

- Failure to respond appropriately - not available or misapplied 


Vulnerabilities of Control Software 

• Incompleteness in software requirements 

Incomplete or wrong assumptions about operation of 
controlled system or supporting computer systems 

- Omitted or ambiguous handling of controlled-system states 
and environmental conditions, including violated 
assumptions and overload 

• Software-related hazards 

- Failing to perform required function 

- Performing function inappropriately 

- Failing to coordinate functions (wrong time, wrong order) 

- Failing to respond appropriately to hazardous condition 
(not recognized, wrong response) 


Hazard Analysis for Complex Systems 

• Model-based hazard analysis project 

• Hazard identification tool 

• Simulation to evaluate design and operations 

• Specifying vulnerabilities and safeguards 


Model-based Hazard Analysis for 
Complex Systems 

• Address the problem of safety due to system 
complexity that leads to incomplete requirements 

- Model-Based Hazard Analysis for Interacting Systems - Engineering 
for Complex Systems Program (J. Malin/PI) 

• Guide the engineer in evaluating system designs and 
identifying hazards and hazard scenarios 

• Model, analyze and simulate unanticipated hazards 
and interactions in system operations 

- Effects of faults, errors and failures to act when expected 

• Focus analysis and simulation strategy to find 
unanticipated system accident scenarios 
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Hazards in Generic Component Library 

CONFIG hybrid device models for early design, with selectable 
failure behavior for problem types in system accidents 

- Focus is on theirnohydraulic processing and management of fluids 

• Water Recovery, Air Revitalization, ISRU and Thermal Control cases 

- Capabilities for simulating combined, cascading and global effects of 
local problems 



Selectable Failure Behavior 

• Styles of modeling failures and degradation 

- Discrete changes triggered by failures and problem inputs 

• Immediate or delayed changes to state, behavior mode or control regime 

- Continuous degradation triggered by failures and problem inputs 

- Nontemporal algebraic relations 

• Performance level affected by conditions 

• Failures to operate or change upon input: stuck flags 

• Random variation in measurement or input 

• Degrading and regenerating processing performance 

• Reactors and separators with multi-component mixtures 

- Add and remove contaminants in rapid fluid composition changes 

- Migrate products, gas or liquid to wrong outflow 

- Imbalance process with feed or flow reversal problems 

• Resource providers with alternative methods for reacting to excessive 
demands from multiple loads 

• Leaks as specifiable additions to simulation scenarios 
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Safeguard specification 

• System design protects against vulnerabilities 

- Unacceptable system weaknesses or hazardous states 

- Identify and classify conditions or causes (input) and problem 
effects (output) 

• Safeguards prevent, reduce or mitigate hazards 

- Phase when applied 

• Prevent conditions, prevent evolution to failure state, prevent impacts 
or damage 

• Respond to failure state, respond to impacts 

- Method 

• Isolation and barriers 

• Detection, analysis and control 

• Robustness: buffers/margins, redundancy/multiples, limited impact 
(e.g., fail operational) 

• Repair, renewal or maintenance 
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Automated Data Collection and 
Routine Review for Safeguards 

System Safeguard Knowledge: 

Safeguard Name: Divert to feed tank 
Types: Buffer, control 

Agent: software automation [or hardware or human operator] 

Verification 
type: inspection 

schedule: automated tank level sampling every 5 seconds 
date/time last verified: runtime value 
Measure of success: level goes below alarm level within 2 hours 
Reporting requirements: add record of occunences to daily performance 
summaries 
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Hazard Reduction for Control Software 

• Types of hazard reduction for control software 

• Safety executive for error control 

• Error recovery methods for control software 

• Support for human analysis and intervention 


Hazard Reduction for Control Software 


• Hazard reduction: make control software 

failure less likely 

- Barriers: lockouts, lockins, interlocks 

- Detection and control: make system, control 
software and supporting computer systems 
easier to control and monitor 

- Robustness: redundancy, safety margins and 
error recovery 
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Safety Executive for Error Control 

• Safety kernel or safety executive to centralize and 
encapsulate safety mechanisms 

• Detection of unsafe conditions by external 
application modules 

- Safety assertions, safeguard reports and watchdog 
processes 

• Responsibility for enforcing safety policy and 
deciding safeguard mechanism for handling 
problem 
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Error Recovery for Advanced Control 

• Robustness 

- Robustness and redundancy in data and computation 

- Limited partial shutdowns and reconfigurations 

- Backward recovery (robustness roll back): detect error, 
return to good state (checkpoint) and proceed with 
alternative version 

• Forward recovery (repair): detect and correct 
erroneous state and consequences 

- Intervention and resumption need careful checking 

• Possibility of incorrect assumptions in requirements 
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Support for Human Analysis and Intervention 

• Help operators gain situational awareness (orienting 
for intervention) 

• Help operators manage varying degrees of autonomy 

• Help operators interact with control agent and safety 
executive for intervention 

- Understand policy and choose recovery mechanisms 

- Complete and negotiate abstract or sketched “command” 

• Change monitoring, control, or constraints and priorities for plans 
and procedures 

- Evaluate recovery plans and procedures and associated 
control software changes and commands 

18 


Future Safety Conscious Systems 

• Barriers and robustness to problems 

• Coordination with safety executive and intervening 
human operators 

- Detection with safety assertions, safeguard reports and 
watchdog processes 

- Control with embedded knowledge of vulnerabilities and 
safeguards 

• Response plan evaluation with simulation before 
resuming interrupted operations 

- Simulation scenarios with embedded potential failures 
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